ABSTRACT

A method is presented for choosing optimal oligodeoxyribonucleotides as probes for filter hybridization,
primers for sequencing, or primers for DNA amplification. Three main factors that determine the
quality of a probe are considered: stability of the duplex formed between the probe and target nucleic
acid, specificity of the probe for the intended target sequence, and self-complementarity. DNA duplex
stability calculations are based on the nearest-neighbor thermodynamic values determined by Breslauer
et al. [Proc. Natl. Acad. Sci. U.S.A. (1986), 83: 3746]. Temperatures of duplex dissociation predicted
by the method described here were within 0.4°C of the values obtained experimentally for ten
oligonucleotides. Calculations for specificity of the probe and its self-complementarity are based
on a simple dynamic algorithm.

INTRODUCTION

The quality of DNA sequencing data is a function of both the quality of the DNA and
the choice of oligonucleotide used as primer. The latter is especially important when
sequencing reactions are performed under non-stringent conditions favoring formation of
imperfect duplexes. For example, DNA sequencing with T7 DNA polymerase is performed
at room temperature, which is usually significantly below the duplex dissociation temperature
(T d) of the primer and DNA to be sequenced. Similarly, an important step in DNA
amplification (polymerase chain reaction, PCR) performed with the Thermus
aquaticus DNA polymerase (Taq polymerase) is choosing two oligonucleotides which are
highly specific to the target DNA and not complementary, especially at their 3' ends. For
example, performing PCR with two 25-nt (nucleotide) primers which were complementary
at only the two 3'-terminal residues yielded a 48-nt product (William Spencer, The
Perkin-Elmer Corporation, personal communication). In addition to their use in sequencing,
oligonucleotides are commonly used as hybridization probes in screening DNA libraries.
They are especially useful when subsets of gene families occur and in diagnosis of genetic
diseases (reviewed in 1).

A good sequencing primer or hybridization probe should: a) form stable duplexes with 
the target sequence under the appropriate conditions; b) be highly specific for the intended 
target sequence, not base-pairing to other regions within the template; and c) not anneal 
to itself. The first requirement is especially important if the oligonucleotide probe is used 
for screening complex DNA libraries, whereas the other two are important for both 
screening and sequencing. A search for an oligonucleotide which would optimally meet 
all three of these criteria would be laborious without a computerized method. 

A critical component of any such computerized method is the algorithm for determination
of the duplex dissociation temperature (T d). Some algorithms are based solely on the
length of the oligonucleotide (reviewed in 2), but are reliable only when hybridizations
are performed in tetramethylammonium salts. A more practical method of T d
determination was described by Suggs et al. (3):



$$T_d = 2\,^{\circ}\mathrm{C} \times (\text{number of AT bp}) + 4\,^{\circ}\mathrm{C} \times (\text{number of GC bp}) \qquad \text{(i)}$$
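
As a quick illustration, equation (i) amounts to counting bases. The sketch below is in Python rather than the language of the program described here; the function name `suggs_td` is ours, and the code assumes an unambiguous DNA sequence containing only A, C, G and T.

```python
def suggs_td(seq: str) -> int:
    """Equation (i): 2 deg C per A/T base pair plus 4 deg C per G/C base pair."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G and T")
    return 2 * at + 4 * gc


# Oligonucleotide 1 of Table I: 21 nt, 8 A/T and 13 G/C
print(suggs_td("GTCGAACCGGAAACCACCCCT"))  # 68
```

The rule ignores base order, so two oligonucleotides with the same composition always receive the same estimate.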



The most precise methods for computing helix stability, however, are based on nearest-neighbor
thermodynamic parameters (4, 5). In the following report we describe a computer
program, termed OLIGO, which computes T d values based on nearest-neighbor
thermodynamic parameters. As a further aid to oligonucleotide selection, the program
determines self-complementarity of the oligonucleotide, the presence of palindromes in
the nucleic acid sequence, and the presence of alternative (non-target) sites for the
oligonucleotide within the nucleic acid sequence. Experimental verification of the
calculations is provided.

Figure 1. Algorithm used by OLIGO for determining self-complementarity of oligonucleotides. A, the 5'-terminus
of the oligonucleotide is progressively moved toward the 3'-terminus, and at each step, a determination of the
number of base pairs is made. If a minimum length of continuous complementarity is set to 2, base pairing found
at steps 3 and 7 will be considered significant and will be stored in memory. No base-pairing occurs between
A 6 and U 9 (step 7), since the minimum hairpin loop size is 3 (Freier et al., 1986). B, calculation of ΔG for
conformation 7 is shown as an example. ΔG l, ΔG r, ΔG i, ΔG sym, and ΔG loop are free-energy increments,
respectively, for left and right terminal mismatch, initiation, symmetry and loop; values are given in kcal/mol.

METHODS

Calculation of T d. The following expression, employing nearest-neighbor thermodynamic
values, was adopted from Freier et al. (5):

$$T_d = \frac{\Delta H}{\Delta S + R \ln(C/4)} - 273.15\,^{\circ}\mathrm{C} + t \qquad \text{(ii)}$$



where ΔH and ΔS are the enthalpy and entropy for helix formation, respectively, R is
the molar gas constant [1.987 cal/(°C × mol)], and C is the concentration of the probe.
Thermodynamic parameters are those of Breslauer et al. (4) in the case of DNA and Freier
et al. (5) for RNA. A novel constant, t, is introduced here and represents a temperature
correction for filter hybridization, since the original work of Breslauer et al. (4) and Freier
et al. (5) determined nucleic acid melting temperatures (T m) in solution. In the
experiments described here and by Suggs et al. (3), T d was based on filter hybridization.
A value of 7.6°C was empirically determined for t for the case where about 10 fmol of
DNA is spotted on 1 mm² of filter area. The optimal hybridization temperature depends
on the time for which filters are washed but is generally 5 to 10°C lower than T d.
The program also calculates T m, which is more useful for planning DNA amplification
experiments. In such a case, the value t of equation (ii) is equal to 0.
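
Equation (ii) can be sketched in a few lines. The Python below is an illustration only, not the original program: the function names are ours, the nearest-neighbor values must be supplied by the caller from the published tables (4, 5), and any initiation or symmetry terms are assumed to have been added to ΔH and ΔS beforehand.

```python
import math

R = 1.987  # molar gas constant, cal/(K x mol)


def sum_nearest_neighbors(seq, params):
    """Sum (dH, dS) over all overlapping dinucleotides of seq.

    params maps each dinucleotide (e.g. "GT") to a (dH, dS) tuple taken from
    the published tables; no values are hard-coded here.
    """
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    dh = sum(params[p][0] for p in pairs)
    ds = sum(params[p][1] for p in pairs)
    return dh, ds


def dissociation_temp(dh, ds, conc, t=7.6):
    """Equation (ii), in degrees C.

    dh in cal/mol and ds in cal/(K x mol), both negative for helix formation;
    conc is the probe concentration in mol/l. Use t=7.6 for filter
    hybridization (T d) and t=0 for melting in solution (T m).
    """
    return dh / (ds + R * math.log(conc / 4.0)) - 273.15 + t


# Hypothetical totals, chosen only to show the call; not measured values.
t_d = dissociation_temp(-150_000.0, -400.0, 1e-9)
t_m = dissociation_temp(-150_000.0, -400.0, 1e-9, t=0.0)
```

With the same ΔH, ΔS and C, the two results differ by exactly t, which is how a single routine can serve both filter hybridization and amplification experiments.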

Determination of self-complementarity. The 5'-terminal sequence of a given oligonucleotide
is aligned with the adjacent sequences as shown in Figure 1A. After a determination of
base-pairing is made at the initial position (position 1), the 5' terminus is repositioned
one nt closer to the 3' terminus and the oligonucleotides are checked for base-pairing again
(position 2). If the length of continuous duplex is equal to or higher than a pre-set minimum,
the position and ΔG, calculated as shown in Figure 1B, are stored in memory. The program
stores in memory only up to 100 such positions, since it is not intended to predict secondary
structure for large nucleic acids.
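
The scan of Figure 1A can be sketched as follows. This is one possible reading in Python, not the original code: it treats each step as a different way of folding the strand back on itself, reports runs of continuous Watson-Crick pairs, and omits the ΔG calculation of Figure 1B. The names `hairpin_stems`, `min_run` and `min_loop` are ours, and only DNA bases are handled.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def hairpin_stems(seq, min_run=2, min_loop=3, max_hits=100):
    """Return (i, j, length) for stems in which seq[i+k] pairs with seq[j-k].

    Indices are 0-based; i is the 5'-most and j the 3'-most base of the stem.
    At least min_loop unpaired bases must lie between any two paired bases.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    # Every fold is identified by s = i + j, constant along one stem.
    for s in range(min_loop + 1, 2 * n - 2 - min_loop):
        i = max(0, s - (n - 1))
        run_start, run_len = 0, 0
        while True:
            j = s - i
            inside = j - i - 1 >= min_loop  # loop still large enough
            if inside and COMPLEMENT.get(seq[i]) == seq[j]:
                if run_len == 0:
                    run_start = i
                run_len += 1
            else:
                if run_len >= min_run:
                    hits.append((run_start, s - run_start, run_len))
                run_len = 0
            if not inside:
                break
            i += 1
    return hits[:max_hits]  # the text caps storage at 100 positions
```

For example, `hairpin_stems("GGGAAAACCC")` includes the stem `(0, 9, 3)`, in which the three G residues pair with the three C residues around a four-base loop.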

Determination of complementary regions. A given oligonucleotide is aligned with the
5'-terminus of the nucleic acid sequence and checked for base-pairing. If the length of
continuous base-pairing is longer than a pre-set value, the position number of the fragments
and length of double-stranded region are saved in memory. The oligonucleotide is then
repositioned one nt closer to the 3'-terminus of the nucleic acid sequence. For PCR reactions,
it is possible to check that primers are not complementary to each other with the same
algorithm.
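
A minimal Python sketch of this sliding comparison is given below. It is an illustration under stated assumptions rather than the original implementation: the oligonucleotide is converted to its reverse complement so that base-pairing becomes a plain character match against the target strand, a run is reported when it is at least `min_run` long (the text says "longer than a pre-set value"), and the function names are ours.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


def complementary_sites(oligo, target, min_run=7):
    """Return (position, length) of continuous duplexes between oligo and target.

    position is the 0-based index in target where the paired stretch begins.
    """
    probe = reverse_complement(oligo)
    target = target.upper()
    hits = []
    # Slide the probe one nt at a time, allowing partial overlap at both ends.
    for offset in range(-(len(probe) - 1), len(target)):
        run = 0
        for k in range(len(probe) + 1):  # the extra step flushes a final run
            pos = offset + k
            match = (k < len(probe) and 0 <= pos < len(target)
                     and probe[k] == target[pos])
            if match:
                run += 1
            else:
                if run >= min_run:
                    hits.append((pos - run, run))
                run = 0
    return hits
```

Passing a second primer as `target` gives the primer-primer check mentioned for PCR; a lower `min_run` is then appropriate because even short 3'-terminal overlaps matter.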

Determination of palindromes. The choice of an oligonucleotide which anneals to all or
part of a palindromic sequence for the nucleic acid in question would cause problems for
double-stranded sequencing. The program OLIGO uses the following algorithm to find
palindromes. A variable min is defined as half the minimum size considered to be a
palindrome. Beginning with position min of the nucleic acid sequence, a 'hairpin structure'
is formed, and base-pairing between position min and min + 1 is checked. If complementarity
is observed, positions min - 1 and min + 2 are similarly checked. This process is continued
until a mismatch is found. If the length of continuous base-pairing exceeds min, the position
and length are stored in memory. The process is then repeated beginning with position

Table I. Predicted and actual dissociation temperatures of selected oligonucleotides.

                                                  Dissociation temperature, T d (°C)
                                          ----------------------------------------------
                                  Length  Experimental  Predicted from   Predicted from
Number  Sequence                  (nt)    data a        equation (i)     equation (ii)
                                                        and error b      and error
1       GTCGAACCGGAAACCACCCCT     21      72            68 (-0.7)        73.4 ( 1.4)
2       GTGCCCATCTGTTCTGTAGGGG    22      69            70 ( 4.3)        69.8 ( 0.8)
3       TCCGGTTCGACAGTCGCC        18      68            60 (-4.7)        69.1 ( 1.1)
4       CTGGATATGGTTGTACAGAGCCC   23      67            70 ( 6.3)        67.0 ( 0.0)
5       GGAGATCAGCCGCAGGTTT       19      67            60 (-3.7)        66.6 (-0.4)
6       CAGCGCCACATACATCAT        18      60            54 (-2.7)        59.3 (-0.7)
7       GACGCAGTCTCCTATGAG        18      55            56 ( 4.3)        53.7 (-1.3)
8       AAAGCAGTCCCATTCAT         17      53            48 (-1.7)        53.7 ( 0.7)
9       CAAAGGTGGAATAAACAT        18      52            48 ( 0.7)        51.5 (-0.5)
10      CCCAGTTTTAATATTTG         17      48            44 ( 0.7)        46.9 (-1.1)

a Experimental values of T d were derived from densitometric scanning of the autoradiogram shown in Fig. 2.
b Error is the difference between theoretical and empirical melting temperature plus offset k:

$$\text{Error} = T_{\text{theor}} - T_{\text{exp}} + k$$

where k is the sum of differences between the empirical and theoretical values divided by the number of
oligonucleotides tested (n):

$$k = \frac{\sum (T_{\text{exp}} - T_{\text{theor}})}{n}$$

and is equal to 3.3°C and 0°C for equations (i) and (ii), respectively.

min + 1 of the nucleic acid sequence and continued until the entire sequence has been checked
for palindromes.
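
The palindrome search described above can be sketched in Python as an expansion around each possible center. This is an illustrative reconstruction, not the original code: `min_half` plays the role of the variable min, a hit is recorded when at least `min_half` base pairs are found (an assumption, since the text says "exceeds"), and positions are returned 0-based.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_palindromes(seq, min_half=3):
    """Return (start, length) of palindromic (self-complementary) stretches."""
    seq = seq.upper()
    n = len(seq)
    hits = []
    # center is the boundary between seq[center - 1] and seq[center]
    for center in range(min_half, n - min_half + 1):
        left, right, pairs = center - 1, center, 0
        # Fold the sequence at the boundary and extend until a mismatch.
        while left >= 0 and right < n and COMPLEMENT.get(seq[left]) == seq[right]:
            pairs += 1
            left -= 1
            right += 1
        if pairs >= min_half:
            hits.append((left + 1, 2 * pairs))
    return hits


print(find_palindromes("CCGAATTCAA"))  # [(2, 6)] -> GAATTC
```

Because most centers fail at the first comparison, the work grows roughly linearly with sequence length.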

Filter hybridizations and T d determination. Either 2 or 20 fmol of plasmid or phage
lambda DNA, in a volume of 0.5 µl, were spotted on MSI Nylon filters (Micron Separations
Inc.) and hybridizations were carried out using 5'-32P-labeled oligonucleotides with a
specific activity of 10^8 - 10^9 cpm/µg as previously described (6). Filters were washed
twice at room temperature for 5 min and kept on ice-cold buffer until washed at a higher
temperature, as specified in text, for 15 min. The filters were then subjected to
autoradiography and the radioactivity quantitated by densitometry. The T d was defined
as the highest temperature at which a significant amount of the radioactivity was retained
on the filter. For the 2-fmol series, a significant amount was considered to be 10-33%
of the strongest signal in the temperature sequence; for the 20-fmol series, this was
considered to be 33-50% of the strongest signal. The T d was calculated as the average
of the two series.

Sequencing reactions. The single-stranded form of plasmid pTZ18R (Pharmacia) containing
cDNA for protein synthesis initiation factor 4E was used (6). Sequencing reactions were
performed using a modified T7 DNA polymerase (Sequenase®, United States Biochemicals)

Figure 2. Hybridization of oligonucleotides to the target DNA. Two series of target DNA were spotted on the
filter, 2 fmol (spots on the left of each numbered column) and 20 fmol (right). Numbers on the top refer to
the oligonucleotides listed in Table I. The temperature of the final wash is indicated. An autoradiogram is shown.

under the conditions supplied by the manufacturer. Primers were synthesized with an
Applied Biosystems Model 380B DNA Synthesizer. 

RESULTS 

Duplex dissociation temperature. In order to compare the predictions of T d calculated from
equations (i) and (ii), ten oligonucleotides ranging from 17 to 23 nt in length were used
(Figure 2). Experimental T d was determined from the retention of radioactivity at
increasing temperatures as described in Methods. Differences between theoretical and
experimental data are shown in Table I. The mean errors for equations (i) and (ii)
were 3.0°C and 0.8°C, and the largest errors were 6.3°C and 1.4°C, respectively. Equation
(ii) was thus approximately four times more accurate than equation (i). Aside from the
difference in predicted vs. experimental T d values, it should be noted that experimental
T d values reported here are 3.3°C higher than the experimental values determined by
Suggs et al. (3). This is likely to be due to a difference in methods of hybridization
employed: Suggs et al. (3) washed filters for increasing lengths of time as the temperature
was raised, since the same filters were used successively. In the present study, separate
filters were used for each temperature, and all filters were washed for the same length
of time, regardless of the temperature.
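
The error figures quoted above follow directly from the Table I values. The short Python check below is a sketch for the reader, not part of OLIGO; the function name `error_summary` is ours, and the error is taken as predicted minus experimental plus the offset k, which is the convention that matches the parenthesized values in Table I.

```python
# Table I values in deg C: experimental, equation (i), equation (ii)
T_EXP = [72, 69, 68, 67, 67, 60, 55, 53, 52, 48]
T_EQ1 = [68, 70, 60, 70, 60, 54, 56, 48, 48, 44]
T_EQ2 = [73.4, 69.8, 69.1, 67.0, 66.6, 59.3, 53.7, 53.7, 51.5, 46.9]


def error_summary(t_exp, t_theor):
    """Return (k, mean absolute error, largest absolute error)."""
    n = len(t_exp)
    # k is the mean of (experimental - predicted), as in footnote b of Table I
    k = sum(e - t for e, t in zip(t_exp, t_theor)) / n
    errors = [t - e + k for e, t in zip(t_exp, t_theor)]
    abs_errors = [abs(x) for x in errors]
    return k, sum(abs_errors) / n, max(abs_errors)


for label, theor in (("(i)", T_EQ1), ("(ii)", T_EQ2)):
    k, mean_err, max_err = error_summary(T_EXP, theor)
    print(f"equation {label}: k={k:.2f} mean={mean_err:.2f} max={max_err:.2f}")
```

The mean absolute errors come out at about 3.0°C and 0.8°C, a ratio of roughly 3.7, which is the basis for the "approximately four times" statement.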

Probe specificity and secondary structure. An oligonucleotide probe should not be
complementary to nucleic acid regions other than the target sequence. Checking the entire
nucleic acid sequence for the possible formation of stable duplexes with a given
oligonucleotide is important if the oligonucleotide is to be used as a sequencing primer,
as discussed above. In order to illustrate the importance of probe specificity, two
sequencing primers, oligonucleotides 6 and 10 (Table I), were compared for their ability
to prime sequencing reactions (Figure 3). Even though both oligonucleotides formed perfect
duplexes with the test plasmid and differed by only one nt in length, oligonucleotide 6
was strikingly more specific as primer. Comparison of these oligonucleotides using OLIGO
revealed that oligonucleotide 6 formed a single non-target site duplex (7 nt), whereas
oligonucleotide 10 formed six such duplexes of up to 10 nt in length. Perhaps also significant
is the fact that two of the latter duplexes involved the 3'-terminus of the oligonucleotide.
At the relatively low temperature of the sequencing reaction (20°C), oligonucleotide 10
apparently forms duplexes with these non-target sequences, resulting in its poor specificity
as a sequencing primer. Neither oligonucleotide was self-complementary, indicating that
secondary structure was not responsible for the poor results with oligonucleotide 10.
Similarly, potential problems due to dimer formation were excluded.

DISCUSSION 

Several methods of calculating hybridization temperatures for oligonucleotide probes are
known. A commonly used one is that of Suggs et al. (3), where T d calculation is based
on the number of AT and GC base pairs. A more precise method for determination of
duplex-melting temperature is based on nearest-neighbor thermodynamic parameters (4,
5). The major drawbacks of using these parameters are first, that they apply to solution

Figure 3. Comparison of the two sequencing primers, oligonucleotides 6 and 10. An autoradiogram of the sequencing
gel is shown on the left. On the right is a full list of the target DNA fragments complementary to the primers
(with continuous complementarity of 7 or more nt). The possible dimer formation, self-complementarity, and
the thermodynamic parameters calculated by OLIGO are shown.

hybridization, and second, that calculation of melting temperature is too laborious to be
done by hand for long sequences. The results presented here demonstrate that with a suitable
computer program, the nearest-neighbor approach can be adapted for filter hybridization
methods. This technique is significantly more accurate than the method based on numbers
of AT and GC pairs (3). We found that screening phage libraries at a density of 500 plaques
per cm² was specific and reproducible when the temperature of the final wash and/or
temperature of hybridization was 5°C lower than T d calculated according to equation
(ii). At the time that an oligonucleotide is being selected on the basis of the maximal T d,
the program OLIGO can be used to determine its suitability based on self-complementarity,
specificity and absence of a palindrome in the target site.

The algorithm employed in this program finds all possible duplexes, independent of 
whether they overlap with each other, unlike several other algorithms used in determination 
of RNA secondary structure (7-9). Further analysis of the duplexes is needed to distinguish 
simple hairpin loops from bulge loops, unbalanced interior loops, pseudoknots or other 
structures. 

Oligonucleotides complementary to palindromic sequences are poor sequencing primers
not only because they can form dimers, and therefore not anneal to the target sequence
efficiently, but also because, when dsDNA is sequenced, synthesis would occur simultaneously
in both directions. In addition, the presence of palindromic sequences is frequently of interest
for reasons other than sequencing. Thus it is important to check the whole sequence file
for palindromes; this is possible with OLIGO since the algorithm is very fast. The total
number of base comparisons is variable and depends on the sequence, but is always close
to n + 0.3 × n, where n is the number of base pairs in the DNA sequence.

OLIGO is designed to find optimal sequencing primers, PCR primers and hybridization
probes. Each of the parameters determined (duplex stability, specificity,
self-complementarity of oligonucleotides, presence of palindromes, ability of the probes to form
dimers) is checked independently. Selection of the probe is made by the investigator.
The program is written in Turbo C (Borland International) for IBM PC-compatible
computers and is available upon request.